-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated regex derivative engine #5567
Conversation
We lost some rewrites From re.smt2: These should have been rewritten to true:
|
re_rewrites.smt2 exposes a reference count leak. E.g., fix by taking a reference count in callee.
|
Need case for is_of_pred. |
Also need REWRITE_FULL, REWRITE1 is not enough
|
I addressed the above comments, but I probably need to revisit where I need to use mk_der_cond as in the missing case of .is_of_pred, at some point I omitted relying on the BDD-like structure -- wanted to do this in a more principled way I wanted instead to use BDDs based on bits in the characters. But I understand this heuristic helps in general with simplifications and detecting unsat cases. |
bdd "structure" shouldn't matter (much) for of-pred. |
Your suggested fixes actually took care of all the failing builds. I was expecting more failures. I guess something I don't understand (and essentially violated the "protocol") is the BR_REWRITE... status flags. We could maybe spend a meeting (say an hour or so) on these and go over the overarching logic, could also be in person at work. |
src/ast/rewriter/seq_rewriter.cpp
Outdated
rational jv; | ||
if (str().is_extract(t, s, j, k) && m_autil.is_numeral(j, jv) && jv.is_nonneg()) { | ||
unsigned i; | ||
expr_ref k_min_1(str().is_len_sub(k, l, s_, i) && s == s_ ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to rewrite this
- pre-condition jv.is_unsigned() allows jv.get_unsigned(). Pre-condition jv.is_nonneg() does not allow jv.get_uint32()
- review for when indices are negative
src/ast/rewriter/seq_rewriter.cpp
Outdated
if (str().is_extract(t, s, j, k) && m_autil.is_numeral(j, jv) && jv.is_nonneg()) { | ||
unsigned i; | ||
// k = |s_| - i, l = |s_| implies that lastpos = j + k - 1 = k + j - 1 = j + l - i - 1 = l + j - i - 1 | ||
expr_ref lastpos(str().is_len_sub(k, l, s_, i) && s == s_ ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to rewrite this
pre-condition jv.is_unsigned() allows jv.get_unsigned(). Pre-condition jv.is_nonneg() does not allow jv.get_uint32()
review for when indices are negative
expr_ref one(m_autil.mk_int(1), m()); | ||
if (str().is_extract(t, s, j, k)) | ||
result = str().mk_substr(s, j, m_autil.mk_sub(k, one)); | ||
if (str().is_extract(t, s, j, k)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
something required inline simplification here. It might be too brittle to assume results of such functions are simplified.
@@ -2881,7 +2896,8 @@ br_status seq_rewriter::mk_re_reverse(expr* r, expr_ref& result) { | |||
br_status seq_rewriter::mk_re_derivative(expr* ele, expr* r, expr_ref& result) { | |||
result = mk_derivative(ele, r); | |||
// TBD: we may even declare BR_DONE here and potentially miss some simplifications | |||
return re().is_derivative(result) ? BR_DONE : BR_REWRITE_FULL; | |||
// return re().is_derivative(result) ? BR_DONE : BR_REWRITE_FULL; | |||
return BR_DONE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was changed, because...
src/ast/rewriter/seq_rewriter.cpp
Outdated
expr_ref seq_rewriter::mk_derivative(expr* ele, expr* r) { | ||
STRACE("seq_verbose", tout << "derivative: " << mk_pp(ele, m()) | ||
/*STRACE("seq_verbose", tout << "derivative: " << mk_pp(ele, m()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove commented out code
src/ast/rewriter/seq_rewriter.cpp
Outdated
} | ||
|
||
expr_ref seq_rewriter::mk_antimirov_deriv(expr* e, expr* r, expr* path) { | ||
// Take reference count of path and r |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not reference count on e?
src/ast/rewriter/seq_rewriter.cpp
Outdated
result = re().mk_ite_simplify(m().mk_eq(e, h), re().mk_to_re(t), nothing()); | ||
} | ||
else | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
formatting
result = nothing(); | ||
else | ||
// observe that the precondition |r1|>0 of mk_seq_rest is implied by c1 | ||
result = re().mk_ite_simplify(c1, re().mk_reverse(re().mk_to_re(mk_seq_butlast(r1))), nothing()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_br.mk_ite
result = mk_antimirov_deriv_union(c1, re().mk_ite_simplify(r1nullable, mk_antimirov_deriv(e, r2, path), nothing())); | ||
} | ||
else if (m().is_ite(r, c, r1, r2)) { | ||
c1 = simplify_path(m().mk_and(c, path)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sometimes called, other times using m_bv.mk_and.
Should the path expression be considered as an ADT?
result = mk_antimirov_deriv_intersection( | ||
mk_antimirov_deriv(e, r1, path), | ||
mk_antimirov_deriv_negate(mk_antimirov_deriv(e, r2, path)), m().mk_true()); | ||
else if (re().is_of_pred(r, r1)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hoist to self-contained function that gets used also the other place
// Take reference count of r and d | ||
expr_ref _r(r, m()), _d(d, m()); | ||
expr* c, * t, * e; | ||
if (m().is_ite(d, c, t, e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is cost model for this rewrite?
src/ast/rewriter/seq_rewriter.cpp
Outdated
} | ||
|
||
expr_ref seq_rewriter::mk_antimirov_deriv_union(expr* d1, expr* d2) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
format
// s in [] <==> false, also: s in () <==> false when |s|>0 | ||
result = m().mk_false(); | ||
else if (m().is_ite(d, c, d1, d2)) | ||
result = re().mk_ite_simplify(c, mk_in_antimirov_rec(s, d1), mk_in_antimirov_rec(s, d2)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m_br.mk_ite
return result; | ||
} | ||
|
||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems more reasonable to have a path.push method that does this instead of creating mk_and then simplifying
Rewrote/updated several derivate related algos both for correctness and performance.
Main changes involve more efficient use of algebraic laws and avoidance of unnecessary rewrites that in many cases were causing orders of magnitude of overhead.
Several open issues should be addressed by the updated engine.
Some old code is still there for convenience to debug possible regressions, that will eventually be removed, once any new regression bugs have been flushed out.